ABSTRACT



This thesis demonstrates that Youth Attitude Tracking Study 
(YATS) data can be used to create a synthetic AFQT classification 
procedure for distinguishing high quality respondents. Unlike 
previous methods, the procedure does not rely on interest in the 
military to predict AFQT category. The estimates are based on an 
analysis of the YATS data matched with the Defense Manpower Data 
Center cohort data file using a binomial logistic regression model. 
The market segment analyzed is 17 to 21 year old males who are 
either high school graduates or prospective high school graduates. 
The dependent variable is whether or not a respondent would score 
above the fiftieth percentile on the Armed Forces Qualification 
Test. The explanatory variables reflect individual demographic, 
educational and labor market characteristics at the time of YATS 
interview. The YATS time frame is restricted to 1983 through 1985 
in order to facilitate future bridging of YATS models with models 
estimated with similar time period data from the National 
Longitudinal Survey of Youth (NLSY). Additionally, the models may
be used to provide estimates of AFQT quality for more recent YATS
respondents.
I. INTRODUCTION AND BACKGROUND

The purpose of this thesis is to describe a method for
estimating mental quality from Youth Attitude Tracking Study (YATS)
data.

Today a new labor market confronts military manpower policy
makers. Perhaps the most favorable development on the demand side
is that a smaller military will require fewer enlistments.
Enlistment requirements will decrease during the scheduled
Department of Defense drawdown and thereby help the services
negotiate the anticipated trough in the youth labor market of the
mid-1990s. Current indications are that some or all of the
services may increase mental aptitude requirements in an effort to
maximize the productivity of the reduced personnel structure. Such
management decisions would call for distributing recruiters across
the geographic areas that yield higher quality recruits and would
probably establish new racial, ethnic, and gender equilibria. A
compelling task still awaits the recruiters: continuing to enlist
high quality volunteer recruits at the least possible cost in the
wake of a massive reduction in both the size of the force and
recruiting resources. Mass reductions also threaten to translate
into a decreased sense of military job security for anyone
considering enlistment.
The supply side of the 1990s military manpower market differs
markedly from that of the past for several reasons. The qualified
military available (QMA) pool is shrinking as the baby boom
generation ages; the pool will begin to grow again in the second
half of the decade as the boomers' children enter the market. The
competitive civilian labor force continues to lure many qualified
youths away from prospective military service. Military
advertising and recruiting costs continue to grow due to plant,
operational, and manpower costs at a time when Congress can be
expected to oppose generous recruiting budgets.

Average achievement scores of those in the youth labor market
continue to decline, requiring increased selectivity. The
announcement in September 1991 by the Scholastic Aptitude Test
administrators that overall scores had dropped again sparked yet
another nationwide round of accusations and hand-wringing in the
news. Meanwhile, the military services must get by with whatever
quality they can extract from those available. The higher
education required for satisfying and lucrative careers compels
many of the highest mental quality individuals to seek a college
or even postgraduate degree before entering the labor market
beyond part-time, school, or summer employment.

Finally, tomorrow's QMA population may grow skeptical toward
the likelihood of obtaining a satisfactory and promising career in
the military if vignettes of those service people discharged during
the early 1990s reduction in force become human interest staples
on television.

Many pertinent factors on the demand side of the market also 
make future recruiting more challenging. Manpower aptitude 
requirements continue to increase due to the high technology 
weapons, engineering, and communications systems employed by the 
military. A logical screening measure to meet increased aptitude
needs during and after the drawdown would be to increase category
I through IIIA Armed Forces Qualification Test (AFQT) entrance
requirements. The 1989 percentages of enlistees in categories IIIA
and above are presented in Table 1.



TABLE 1. SCORES OF 1989 NON-PRIOR SERVICE (NPS) ACCESSIONS BY
SERVICE¹ (IN PERCENT)

AFQT                          MARINE         AIR            DOD
CATEGORY     ARMY     NAVY    CORPS          FORCE          AVERAGE

I-IIIA         63       59    66             85              65
IIIB           31       31    33             15              29
IV              7       11    [illegible]    [illegible]      6
TOTAL         100      100    100            100            100



Increasing entrance standards would inevitably restrict the
number of entrants from disadvantaged educational backgrounds;
i.e., the proportion of minorities entering the military would fall
drastically.² Evolving expected threat scenarios portend changing
missions and, hence, active duty force structure instability for
the foreseeable future. The active duty versus reserve manpower
mix can also be expected to change several times during the next
generation as Defense Department planners seek a new active/reserve
equilibrium. In short, despite any savings to be realized from the
peace dividend, the force structures of individual services may
change. Through all of these challenges the U.S. military must
maintain maximum manpower readiness to respond to as yet unforeseen
crises around the world with state-of-the-art military technology.

¹For further discussion on demographic characteristics of
actual enlistees for FY 1989, see Population Representation in
the Military Services, Fiscal Year 1989, Office of the
Assistant Secretary of Defense (Force Management and
Personnel), July 1990.

All of the above supply and demand factors affect the 
recruiting force's mission and cannot be ignored when considering 
the military manpower environment of the 1990s and beyond.
Recruiting must become more efficient than ever. Finding recruits 
at the lowest cost per recruit will remain at the heart of the 
recruiting service's mission. 

Cost per recruit may be examined by investigating total
recruiting costs, individual costs, and cost by categories of
recruits. Central to minimizing recruiting costs under all three
measures is targeting qualified youth who are most likely to be
interested in joining the military and converting their interest
into enlistment. Individuals are considered qualified military
available (QMA) if they are 17 to 21 years of age, are high school
graduates or equivalent, pass the armed forces medical physical
examination, pass a computerized legal records background check,
and score in the upper fiftieth percentile of the AFQT.

²Sixty percent of all blacks who enlisted in the military
in 1989 were categorized as IIIB or IV; i.e., they would have
been ineligible under a I through IIIA only criterion.

The basic goal of the recruiting service is to induct the 
number of qualified recruits necessary to perform the tasks 
required by today's and tomorrow's military systems. The goal must 
be achieved by an ongoing iterative process: identify those who
are QMA, target them for recruitment, and induce them to enlist.
Since some people are more inclined (or at least susceptible) to
entering military service than others, it is to the recruiter's
advantage to learn which QMAs are most likely to enter, where they
are located, and then to target them specifically for recruitment.

Still, a significant percentage of uninterested youth
eventually enlist and therefore must be actively sought out.
Identifying who is mentally qualified and where they are is the
first step. From the individual's perspective, choosing a job,
even if it may last only two years, is a matter of personal
well-being. The decision encompasses far more than simply comparing
military wages and benefits to those of alternative civilian jobs;
the nonpecuniary aspects of the two alternatives make it a matter
of taste as well. Some people assign high intrinsic value to
serving their country. Some cite reasons such as travel
opportunities or friendships. Others may view the military as
their best chance at overcoming socioeconomic barriers in their
local area. If the recruiter could be armed with good indicators
(predictors) of individual taste and ability, QMA youth might be
recruited at lower cost; the better the predictor, the better the
recruiter's results. Unfortunately, concrete data do not exist
that will tell in advance exactly who is qualified, where to send
recruiters, or even who will or will not enlist once contacted.
Intrinsically, the enlistment decision is fraught with individual
tastes that may either magnify or dampen more quantifiable measures
such as earnings comparisons.

Consequently, manpower analysts must often employ survey data
in which cause-and-effect relationships are tenuous at best.
The manner or order in which survey questions are presented to
respondents may introduce bias. Even the selection of the
respondents may result in biased feedback. To obtain all of the
variables necessary for the estimates desired, analysts may find it
necessary to match or merge different data sets in order to capture
both pecuniary and nonpecuniary factors. From the resultant
"complete" data set the analyst hopes, with the aid of hindsight
obtained from historical records, to document trends which can help
predict, for example, where future QMAs with similar
characteristics or profiles can be found, and based on their survey
responses, what they will be inclined to decide.

Two data sets on interest in the military which have proven
useful to manpower analysts are the National Longitudinal Survey of
Youth (NLSY) and the Youth Attitude Tracking Study (YATS). The
NLSY data set was initiated in 1979 to study the labor force
behavior of American youth. The NLSY data are weighted to
compensate for unequal probability of selection. The weights are
adjusted annually for respondents who drop out of the survey and
the changing population represented by the sample (Bock and Moore
1984). The NLSY initially included 12,686 respondents (NLSY
Documentation). Unfortunately, the 1979 cohort has aged 12 years
and more current data would be desirable. Military interest
questions were dropped after 1985. These two data sets are
discussed more fully in Chapter III.

The YATS data set contains more than 300 variables on personal
traits encompassing family background, education, employment, and
interest in military service. YATS data do not include regional or
local demographic, education, or labor force experience variables,
but the study does contain responses about the respondents'
perceptions of the job market. The data are collected annually by
the Department of Defense using telephone surveys of a sample of
American residents between 16 and 24 years of age who have no prior
military service and less than two years of college experience.
The respondents are segmented into four groups: 16 to 21 and 22 to
24 years old for both males and females. If the respondents
provide their social security number, their responses can be
matched with personnel files from the Defense Manpower Data Center
to determine whether the respondent actually enlisted, entered the
Delayed Entry Program, or completed an entrance examination.

A research plan for comparing interest in the military as
measured by YATS and the NLSY would be as follows: using
multinomial logistic regression, predict the probability of
enlistment by analyzing YATS matched data using techniques similar
to those used by Thomas and Gorman (1991) on NLSY data. The
similarly constructed regression models should render results that
facilitate comparing the efficacy of the two surveys in predicting
qualification and probability of enlistment.

A comparison of the enlistment predictive capabilities of YATS 
and NLSY data sets should yield several benefits. Since the data 
sets arise from different surveys they could logically be suspected 
of offering different insights into patterns of enlistment as well 
as other areas of interest to manpower analysts. One set may offer 
more predictive ability than the other in certain respects. A 
possible weakness of NLSY is that as its cohort ages the responses 
may no longer be representative of current and future youth in the 
17 to 21 age category. YATS, on the other hand, gathers new data 
each year. One of its weaknesses is that AFQT scores are available 
for relatively few respondents because of optional social security 
number disclosure. The 40 percent of respondents who do not 
provide a social security number inject selectivity bias into the 
sample in that they cannot be matched to future enlistment actions. 
All NLSY respondents provided their social security numbers and
took the Armed Services Vocational Aptitude Battery (ASVAB).

Exploiting the AFQT scores of the NLSY and the annual sampling
of the prime market by YATS may significantly improve a
particular model's predictive ability. If YATS data produce the
same results using a model similar to that used by Thomas and
Gorman in 1990, the data from future YATS waves could be used in a
model in an effort to distribute recruiters optimally around the
nation. The strengths and weaknesses of the two sets, once
realized, may be exploited to reduce recruiting costs while
increasing the number and quality of enlistees. Armed with this
information manpower planners can more effectively assign
recruiting goals and allocate resources to geographic areas during
optimum time periods and economic conditions to obtain maximum
recruiting results.

A first step in making any YATS and NLSY comparisons is to 
develop an acceptable predictor of AFQT scores from respondent 
information in YATS. Such a proxy is necessary to partition 
respondents into appropriate market segments. The purpose of this 
thesis, therefore, is to construct models that accurately predict 
high quality AFQT prospects by exploring DMDC-matched YATS data for
theoretically consistent explanatory variables. The models for
White, Black and Hispanic prime market males will then be evaluated 
based on actual AFQT scores contained in the DMDC-matched data set. 
Issues of secondary interest are: "What are the surveyed interests 
of the matched QMA respondents?" and, "How does the matched sample 
set compare to the non-matched sample set?" 

Logistic regression analysis will be employed to analyze YATS
data for the years 1983 and 1985.³ The analysis by Thomas and
Gorman on NLSY was considered the control analysis. By replicating
their analysis using matching variables when possible and proxy
data when feasible, the predictive capacities of the two data sets
may be compared in subsequent work.

³See Section 1 of Chapter 3 for rationale.
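
For reference, the binomial logistic regression named above can be
written in its standard textbook form. The notation is assumed here
for illustration and is not taken from the thesis: y_i equals 1 if
respondent i scores above the fiftieth percentile on the AFQT and 0
otherwise, x_i is the vector of explanatory variables observed at the
YATS interview, and beta is the coefficient vector to be estimated.

```latex
\[
P(y_i = 1 \mid \mathbf{x}_i)
  = \frac{\exp(\mathbf{x}_i'\boldsymbol{\beta})}
         {1 + \exp(\mathbf{x}_i'\boldsymbol{\beta})},
\qquad
\ln\frac{P(y_i = 1 \mid \mathbf{x}_i)}{1 - P(y_i = 1 \mid \mathbf{x}_i)}
  = \mathbf{x}_i'\boldsymbol{\beta}.
\]
```

The second expression shows that each coefficient measures the change
in the log odds of scoring in the upper half of the AFQT distribution
for a one-unit change in the corresponding explanatory variable.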
II. LITERATURE REVIEW



Since World War I, manpower analysts have sought to measure the 
mental aptitudes of young males, 17 through 21 years of age. Since 
the inception of the All-Volunteer Force in 1973, analysts have
attempted to ascertain an individual's propensity to join the 
military given that he is mentally qualified. Perhaps the best 
early independent study on military labor supply and demand is 
Cooper's work in 1974. Before then remarkably little independent 
work had been done on the subject.⁴ Earlier propensity studies
were hampered by a lack of empirical data regarding the attitudes
and opinions of young people toward military service. Even if
exhaustive surveys had been available, the lingering memories of
the Vietnam War era would probably have skewed the responses so
that projecting estimates for future years would have been
difficult.

By the late 1970s, the Vietnam specter had faded and some
detailed manpower related data sets had been constructed.
Computerized data analysis costs had fallen enough to encourage the
services to embark on serious enlistment propensity studies.
Construction of YATS and NLSY, the two major data sets with which
this thesis is concerned, began in 1976 and 1979, respectively.
Many of the major definitive works on the youth labor market were
developed during the 1980s using these two data sets along with
other sources of demographic statistics.

⁴Cooper, R. V. L., Military Manpower and the All-Volunteer
Force, RAND Report Number R-1450-ARPA, RAND Corporation, Santa
Monica, CA, 1974.

Orvis (1982) determined that the YATS propensity data were a
good predictor of an individual's future probability of enlistment
despite the fact that some respondents actually decide a year or
more after participating in the study. He also determined that the
interests and intentions solicited by the study could best predict
actions within 18 months and, to a lesser degree, up to 48 months
later.

Later, Orvis (1984) showed that regional enlistment rates are 
positively correlated with regional enlistment interest. This 
finding constituted one of the earliest proofs that targeting the 
most lucrative geographic areas would be useful. 

Orvis and Gahart (1985) demonstrated that the perception of 
pecuniary factors such as job characteristics, job security, 
opportunity for advancement, and educational benefits, influences 
young people to enlist. Less obvious, but more significant to this 
thesis, they found that the probability of enlistment varied with 
surveyed intentions. Orvis and Gahart concluded that intentions 
may better predict enlistment than demographic data alone. They 
also found that respondents indicating little enlistment interest 
nevertheless constituted a large proportion of eventual enlistees. 
Other studies such as Siegel and Borack (1981) and Hanssens and
Levien (1983) have found that interest in a particular service and
the likelihood of enlisting in that particular branch are
positively correlated as well.

Gorman and Mehay (1990) determined that interest data change
over time and with geography. Since intentions are not stable over
time, older data sets may produce unreliable predictions. This 
finding is significant in that it suggests that a data set such as 
the NLSY, while valuable in many ways, has a decreasingly direct 
applicability over time. Data on interest in military service 
gathered from the 1980 NLSY cohort may not be easily substituted 
for the attitudes and opinions of the 17 to 21 year old age group 
of 1991. Combining the time variance with the geographic variance 
could turn out to be a recipe for gross errors in predictive 
capability, verifiable only after the fact. 

Prior to embarking on theoretically valid studies of the
qualified military interested (QMI) population, one must first
accurately assess QMA. Since AFQT is the measure of choice
for the American military it is imperative that the analyst first
determine who is qualified and then focus on the resultant data set
for propensities. The history of the art of mental aptitude and
psychological testing is long and convoluted. Sir Francis Galton
of England, who wrote Hereditary Genius in 1869, Alfred Binet of
France, and Lewis Terman, who wrote after the turn of the century,
form a representative group of early manpower analysts who
addressed mental aptitude. A debate on the relative importance of
nature versus nurture arose early and has never abated; that is,
are people born smart or can mental aptitude be developed by
placing the subjects in a nurturing environment?⁵

The interest of the American military in this area has been 
reflected by its entrance tests, conceived during World War I and 
eventually developed into the ASVAB. The AFQT score is derived
from the ASVAB score.⁶

Curtis, Borack and Wax (1987) first attempted to estimate 
regional QMA by clustering demographically similar counties, based 
on socioeconomic attributes that were correlated with AFQT scores, 
the major determinant of QMA. Goldberg and Goldberg (1989) and 
Orvis and Gahart (1989) found that the distributions among mental 
categories of population subgroups could be estimated. 

Thomas and Gorman (1991) examined NLSY and its large sample of
respondents who took the ASVAB in 1980. They applied the ASVAB
scores of NLSY to a representative sample of the youth population,
which was estimated using Woods and Poole racial/ethnic estimates.
Using explanatory variables that can be obtained down to the county
level, or legitimate proxies, they calculated geographical
distributions of the actual number of enlistees, i.e., qualified
military joiners (QMJ), derived sequentially from QMA and then
those in that group interested in joining, qualified military
interested (QMI).



⁵For a more thorough treatment, see Peterson (1990).

⁶See Eitelberg (1988) for an excellent history of
American military entrance and placement testing.
This thesis seeks to develop a model similar to that of Thomas
and Gorman using a different data set, YATS, which may facilitate 
future comparison of the predictive capabilities of the two data 
sets while recognizing the potential future applicability to 
current youth population samples captured annually by YATS. 

This thesis also seeks to build on a related work by Snyder
(1989). Using YATS variables, Snyder devised a preliminary AFQT
model in order to establish a "YATS prime market" for use in
subsequent analysis of QMA distribution and propensity to enlist.
III. DATA AND METHODOLOGY



A. SAMPLE DESCRIPTION 

YATS is part of the Joint Market Research Program which
contributes to Department of Defense recruiting policy and
marketing. Each military service may provide input through the
Joint Market Analysis and Research Committee (JMARC). YATS yields
annual data about the propensity of youth to enlist in both the
active and reserve components of the U.S. military. It also
measures youth awareness of military advertising, contact with
recruiters, and knowledge of the financial incentives for
enlisting.⁷ Appendix A contains all YATS questions used in this
thesis.

The first version of the study was originally known as YATS and
was initially collected in the spring and fall of each year. In
1978 YATS was combined with the information from the Reserve
Component Attitude Study (RCAS) to coalesce regular and reserve
recruiting strategies. The spring collection effort was dropped in
1981. In 1983 YATS and RCAS questionnaires were combined into a
single survey and became known as YATS II. The first eight years
of YATS saw many changes in survey questions, weighting and
sampling. The years with which this research is concerned, 1983
through 1985, saw little change with the exception of question
numbering (variable identification) and minor question changes. A
market redefinition in 1986 occurred after the last year of this
thesis' sample and does not affect this work.

⁷Defense Manpower Data Center Report on YATS II Wave 15,
Fall 1984, April 1985.

The primary data for this research were contained in two files
originally provided to the Defense Manpower Data Center (DMDC) by
two⁸ contract survey firms. The files contain the 1983 YATS I and
1985 YATS II surveys. These two years of survey data were matched
by the respondents' social security numbers to personnel data files
held by DMDC. YATS records of respondents who provided their
social security numbers during their telephone interview were
matched against social security numbers contained in the DMDC data
set. There exists an a priori selectivity bias with the YATS
matched data sets because DMDC is unable to match files to those
respondents who did not provide their social security numbers.
DMDC generates a record for each individual who takes a
pre-enlistment examination, enters the Delayed Entry Program, or
actually enlists in the armed forces. The DMDC-matched data sets
include all YATS respondents' records whether or not they provided
a social security number. DMDC appends DMDC data records to the
YATS records only if there exists a social security number match.
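
The matching rule described above (every YATS record is kept, and
DMDC data are appended only on a social security number match) behaves
like a left join. The following is a minimal sketch of that rule, not
DMDC's actual procedure; the function name, field names, and sample
values are hypothetical and chosen only for illustration.

```python
def append_dmdc(yats_records, dmdc_by_ssn):
    """Keep every YATS record; attach DMDC data only when the SSN matches."""
    merged = []
    for rec in yats_records:
        ssn = rec.get("ssn")  # None when the respondent gave no number
        dmdc = dmdc_by_ssn.get(ssn) if ssn else None
        merged.append({**rec, "dmdc": dmdc, "matched": dmdc is not None})
    return merged


# Hypothetical records; the SSNs are deliberately invalid placeholders.
yats = [
    {"id": 1, "ssn": "000-00-0001"},   # provided SSN, later contacted DoD
    {"id": 2, "ssn": None},            # declined to provide an SSN
    {"id": 3, "ssn": "000-00-0003"},   # provided SSN, no DMDC record
]
dmdc = {"000-00-0001": {"afqt_percentile": 62}}

merged = append_dmdc(yats, dmdc)
matched_ids = [r["id"] for r in merged if r["matched"]]
print(len(merged), matched_ids)
```

Respondents 2 and 3 are both unmatched, but for different reasons:
one could never be matched, while the other simply had no later
contact with the military. The matched file cannot distinguish the
two cases, which is the source of the selectivity bias noted above.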

The YATS calling strategy is based on a two-stage procedure
known as Mitofsky/Waksberg random digit dialing. The first stage
clusters households identified by the first eight digits of a
ten-digit telephone number. The second stage uses a random
selection of the last two digits. The Mitofsky/Waksberg procedure
is modified to accommodate geographic breakdown according to
servicing military entrance processing stations (MEPS) and
differing sampling rates for the market segments. To avoid calling
the same home twice, the sampling is done "without replacement."⁹
Market segment stratification is 50 percent younger male, 30
percent younger female, 10 percent older male, and 10 percent older
female. Weighting was not a factor in this work since only one
market segment was analyzed according to racial group. The total
yearly sample goal is about 10,000.¹⁰

⁸The YATS survey contract company changed in 1984.
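
The two-stage idea can be sketched in a few lines. This is a
simplified textbook version of Mitofsky/Waksberg sampling, not the
YATS contractors' implementation: it omits the MEPS geographic
breakdown and the segment sampling rates, and the function name, the
is_residential callback, and the telephone prefixes are all
hypothetical.

```python
import random


def mitofsky_waksberg(prefixes, is_residential, n_clusters, per_cluster, rng):
    """Two-stage random digit dialing (simplified sketch).

    Stage 1: take an 8-digit prefix and dial one random number in its
    bank of 100; retain the prefix only if that number is residential.
    Stage 2: dial further 2-digit suffixes in each retained prefix,
    without replacement, until per_cluster residences are found.
    """
    sample = []
    candidates = list(prefixes)
    rng.shuffle(candidates)              # random order of candidate banks
    retained = 0
    for prefix in candidates:
        if retained == n_clusters:
            break
        suffixes = [f"{i:02d}" for i in range(100)]
        rng.shuffle(suffixes)            # each number is dialed at most once
        first = prefix + suffixes.pop()
        if not is_residential(first):
            continue                     # reject the whole bank
        retained += 1
        cluster = [first]
        while suffixes and len(cluster) < per_cluster:
            number = prefix + suffixes.pop()
            if is_residential(number):
                cluster.append(number)
        sample.extend(cluster)
    return sample


def toy_is_residential(number):
    # Hypothetical stand-in for the outcome of an actual call.
    return int(number[-2:]) % 4 == 0


prefixes = [f"555010{i:02d}" for i in range(40)]   # fictional 8-digit banks
sample = mitofsky_waksberg(prefixes, toy_is_residential,
                           n_clusters=3, per_cluster=5,
                           rng=random.Random(0))
print(sample)
```

Rejecting a bank after one non-residential call is what makes the
method economical: banks dense in households are retained with
higher probability, so fewer calls are wasted on unassigned or
business numbers.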

The years 1983, 1984, and 1985 were specifically chosen because
YATS began asking questions similar to those asked in the NLSY in
1983. NLSY stopped asking military propensity questions after
1985; the cohort had aged beyond the prime market age parameters by
then. The 1984 YATS was omitted from this research because 2,060
observations were missing from the original 10,000-record data set
and only 54 matched records remained. The new contract company
experienced difficulty with the 1984 records that stemmed from
social security number data problems.¹¹ Typically, of the 10,000
total respondents, 500 to 800 of those who provide social security
numbers experience some later contact with the Department of
Defense. With only 54 of the remaining 7,940 1984 respondents
showing contact, there likely exists some selectivity bias for
that year.

⁹Waksberg, 1978, pp. 40-45, and Snyder, p. 14.

¹⁰For an excellent summary of YATS constraints,
requirements, and minimum specifications, see Snyder, pp. 12-15.

¹¹Determined during a telephone conversation between
author and Ms. Elaine Sellman of DMDC on 7 October 1991.

DMDC first matched its files with YATS in 1989. The latest 
recorded enlistment actions happened in September 1988; that date 
is late enough to encompass the vast majority of enlistment 
decisions of the most recent sample of this study, 1985. The 
youngest respondents were 17 in 1985 and would have aged four years 
to 21 by 1989, still in the prime market. 

YATS I survey questions were recoded in 1984. A variable- 
matching procedure was necessary in this analysis to retain the 
1983 to 1985 time frame integrity and sample size. All 1983 
variables were renumbered to match 1985 variable names. 

B. SAMPLE REDUCTION 

The goal of this thesis was to predict the AFQT of the DMDC 
file using YATS survey results for prime market males. The prime 
market for men is defined as upper fiftieth percentile AFQT males, 
17 to 21 years of age, who possess high school diplomas. This work 
was aimed at males only. Most recruiting is directed at prime 
market males because the other markets are relatively self- 
recruiting; applicants of those categories generally apply for 
enlistment in more than sufficient numbers to meet current goals. 

The initial data set was reduced to include only those in the
male prime market, at both YATS interview date and ASVAB test
date, and those who had provided social security numbers to
facilitate future AFQT feedback. Primarily the respondent's
individual survey information was used to perform the reduction;
the only DMDC-matched filter employed was the respondent's age at
his ASVAB test date. If a respondent delayed taking the ASVAB
until after his twenty-second birthday he was no longer a prime
market candidate. Historically, 60 percent of YATS respondents
provide their social security numbers. The other 40 percent either
do not yet have a number, do not know it, or decline to provide it.
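
The age and identification filters described above can be expressed
as a single predicate. This is a sketch under stated assumptions, not
the thesis' actual code: the record layout and field names are
hypothetical, and treating a missing ASVAB age as a failure is an
assumption made here because no AFQT score would then be available.

```python
def in_male_prime_market(resp):
    """Return True if a respondent record passes the filters above."""
    if resp["sex"] != "M":
        return False
    if resp["ssn"] is None:              # cannot be matched to DMDC files
        return False
    if not 17 <= resp["age_at_interview"] <= 21:
        return False
    # The only DMDC-based filter: age at the ASVAB test date.
    age_at_test = resp["age_at_asvab"]
    return age_at_test is not None and age_at_test < 22


# Hypothetical respondents; SSNs are invalid placeholders.
on_time = {"sex": "M", "ssn": "000-00-0001",
           "age_at_interview": 18, "age_at_asvab": 19}
late_test = {"sex": "M", "ssn": "000-00-0002",
             "age_at_interview": 21, "age_at_asvab": 22}
no_ssn = {"sex": "M", "ssn": None,
          "age_at_interview": 18, "age_at_asvab": 19}

results = [in_male_prime_market(r) for r in (on_time, late_test, no_ssn)]
print(results)
```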

There were a total of 17,378 observations for YATS 1983 and 
1985. Of these, 1,552 observations match DMDC files. Table 2 
presents the comparison of matched and non-matched observations for
both years.



TABLE 2. DELETION OF UNMATCHED RECORDS

                   1983       1985      TOTAL

TOTAL             7,419      9,959     17,378
MATCHED             698        854      1,552
UNMATCHED         6,721      9,105     15,826
PCT MATCHED         9.4        8.6        8.9
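
The match rates in Table 2 follow directly from the counts. A short
check, using only the totals and matched counts reported in the table
(the variable and function names are illustrative):

```python
# Counts taken from Table 2.
totals = {"1983": 7419, "1985": 9959}
matched = {"1983": 698, "1985": 854}


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


rates = {year: pct(matched[year], totals[year]) for year in totals}
rates["TOTAL"] = pct(sum(matched.values()), sum(totals.values()))
unmatched_total = sum(totals.values()) - sum(matched.values())
print(rates, unmatched_total)
```

The computed rates for 1983 and 1985 agree with the table, and the
combined rate for both years works out to 8.9 percent.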



Gender (Q402) was used to eliminate all females. Of the 1,552 
matched observations, only 156 were female. Table 3 portrays the 
deletion of females from the target sample resulting in 1,396 
remaining observations. 
TABLE 3. DELETION OF MATCHED RECORDS OF FEMALES

               MATCHED    UNMATCHED      TOTAL

MALES            1,396       11,365     12,761
FEMALES            156        4,461      4,617
TOTAL            1,552       15,826     17,378



Age (Q403) was used to eliminate men not of prime market age. 
Of the 1,396 male observations, 402 were either 16 or 22 and older. 
This filter leaves 994 observations in the target sample as shown
in Table 4.



TABLE 4. DELETION OF 16 AND 22/ABOVE AGE GROUP RECORDS

                        16    17 TO 21    22 TO 29      TOTAL

MATCHED RECORDS        286         994         116      1,396
TOTAL                3,415      11,630       2,333     17,378



The 119 matched male respondents who delayed taking the ASVAB 
until they had aged beyond the prime market were also deleted, 
leaving 875 observations. 

Other reductions were more subtle. Identifying eligible
respondents still in high school required several steps. First,
non-high school graduates (Q406), 19 or older, not taking high
school classes at a regular day school (Q406 and Q407) were
deleted. This step deleted drop-outs and certificate holders:
adult basic education (ABE) and general educational development
(GED). If the respondent answered "none" to the types of degrees
he had received and stated that he was not or would not be enrolled
in school the following year, he was classified as a non-graduate.
This step removed 152 more observations, leaving 723 in the target
sample. Table 5 depicts the results of this step.



TABLE 5. — DELETION OF RECORDS WITH NON-HIGH SCHOOL DIPLOMAS 



    TOTAL       NON-HS GRADS       HS GRADUATES &
                                   PROSPECTIVE GRADS

      875                152                     723



While care must be taken not to include non-graduates in the
sample, prospective high school graduates should be retained. Many
of the remaining respondents were still in school and, therefore,
potential high school graduates. This reduction step attempted to
identify who could be considered likely future high school
graduates and included in the male prime market analysis. Those
still in high school but likely to graduate are of more interest to
recruiters than any other group. They are easily located and
usually have not yet made career decisions. If the respondent
replied that he would be in school and that the type of school
program was a regular day high school, he was considered to be in
school and a potential high school diploma graduate. These
respondents were identified with questions Q700, Q698 and Q699.

High school students who probably would not receive a high
school diploma were then eliminated. If a respondent stated that
his high school grades were C's and D's or D's and F's (an average
of 69 and below) and that he had never taken and did not plan to
take a college entrance test, he was not considered a prospective
high school graduate. This final reduction step excluded only 57
observations, leaving 666 observations for final AFQT analysis.
Table 6 compares, by racial category, the number of respondents
still in high school to the number of respondents who had already
graduated.
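
The arithmetic of the reduction steps described above can be checked
with a short script. The Python below is only an illustrative sketch
and is not part of the original analysis, which was carried out in
SAS. The counts are those reported in the text; the step labels and
the function name apply_reductions are assumptions introduced here
for illustration.

```python
# Sample-reduction cascade for the DMDC-matched records, using the
# counts reported in the text. Labels and names are illustrative.
TOTAL_RESPONDENTS = 17_378   # matched plus unmatched records
MATCHED = 1_552              # records matched to DMDC files

# (description of the deleted group, number of records removed)
REDUCTION_STEPS = [
    ("females", 156),
    ("age 16 or 22 and older at interview", 402),
    ("took the ASVAB after aging beyond the prime market", 119),
    ("non-high school graduates", 152),
    ("unlikely prospective graduates", 57),
]


def apply_reductions(start, steps):
    """Return (label, remaining count) after each deletion step."""
    remaining = start
    trail = []
    for label, removed in steps:
        remaining -= removed
        trail.append((label, remaining))
    return trail


if __name__ == "__main__":
    print(f"percent matched: {100 * MATCHED / TOTAL_RESPONDENTS:.1f}")
    for label, remaining in apply_reductions(MATCHED, REDUCTION_STEPS):
        print(f"after deleting {label}: {remaining}")
```

Run as a script, this prints the running count after each deletion,
which can be compared with the figures quoted in the text and tables.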



TABLE 6. — IN HIGH SCHOOL VERSUS OUT OF SCHOOL 





                  WHITE      BLACK      HISPANIC      TOTAL

STILL IN
HIGH SCH            253         54            26        333

HIGH SCH
GRADS               254         55            24        333

TOTAL               507        109            50        666



For DMDC's purposes, a YATS respondent can have four basic
types of contact with the military. The respondent either takes
the ASVAB only, takes the ASVAB and enters the Delayed Entry
Program (DEP), is discharged from the DEP, or takes the ASVAB and
immediately enlists in the military. Those who provided their
social security numbers but who never took the ASVAB were recoded
from zero to a missing value. Respondents who failed to provide
their social security number were already coded as a missing value,
whether or not they ever took the test, because there is no way to
match their records. These last two groups constitute the
non-matched sample set in this analysis. Table 7 shows the military
contact distribution by race for the 666 observations in the
selected matched sample.

TABLE 7. MILITARY CONTACT BY RACE

                  WHITE      BLACK      HISPANIC      TOTAL

TOOK AFQT           260         50            26        336
ENTERED DEP          23          4             1         28
ENLISTED            222         55            22        299
DISCH FROM DEP        2          0             1          3
TOTAL               507        109            50        666



C. DEPENDENT VARIABLE

AFQT percentile (AFQTPCT) from the YATS matched respondents was
used to develop the dependent variable. Conversion to raw scores
was not necessary since the AFQT percentile score was used to
determine whether a respondent scored in the upper or lower half of
all respondents. If a respondent scored in the fiftieth
percentile or higher, the respondent was categorized as high quality
(HQ=1). If a respondent scored lower than the fiftieth percentile,
the respondent was categorized as non-high quality (HQ=0). The
HQ/Non-HQ distribution for actual matched AFQT scores is presented
in Table 8.
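
The coding rule for the dependent variable can be expressed in a few
lines. The following Python is a hedged sketch rather than the code
used in the thesis; the function name code_high_quality is
hypothetical, and the treatment of a missing score as None is an
assumption that mirrors the missing-value coding of non-matched
records described earlier.

```python
def code_high_quality(afqt_percentile):
    """Return 1 (HQ) for a percentile of 50 or higher, else 0 (non-HQ).

    A missing score (None) stays missing, so that a record without a
    matched AFQT score is not mistaken for a non-high quality one.
    """
    if afqt_percentile is None:
        return None
    return 1 if afqt_percentile >= 50 else 0


if __name__ == "__main__":
    # Illustrative percentiles only; these are not records from YATS.
    for score in (72, 50, 49, None):
        print(score, "->", code_high_quality(score))
```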

TABLE 8. — QUALITY DISTRIBUTIONS OF OBSERVATIONS

                  WHITE      BLACK      HISPANIC      TOTAL

N                   507        109            50        666
HIGH QUALITY        .59        .21           .38        .51
NON-HIGH QUALITY    .41        .79           .62        .49
TOTAL                 1          1             1          1



D. METHODOLOGY

The DoD prefers to accept recruits from the youth population
who score above the fiftieth percentile of the AFQT. The upper
half mental groups are defined as mental categories I, II and IIIA.
As stated earlier, this research focused on 17 to 21 year old males
who were either high school graduates or prospective graduates.
This group was selected because it is more supply constrained than
the other demographic groups and, therefore, of more interest to
the military. The target sample was partitioned into three
demographic groups consistent with generally accepted DoD
demographic categories: White, Black and Hispanic. Included in
the White category were minorities other than Black and Hispanic,
such as Oriental and Pacific Islanders, of which there were 20
observations in the matched set.

This research uses binomial logistic regression, often referred
to as a logit model, to develop models that accurately predict
whether a respondent was a high quality (HQ) prospect or a non-high
quality (non-HQ) prospect. Binomial, or binary, logistic
regression simply means the dependent variable is dichotomous,
i.e., true/false or yes/no. The logistic functional form restricts
the probabilities to the range of zero to one.

\[
\ln\left[\frac{P(HQ)}{1 - P(HQ)}\right] = a + B_1 X_1 + \dots + B_n X_n + u
\]

where:

P(HQ) = probability that the respondent was a high quality
prospect

a = intercept term

B_1 to B_n = coefficients as estimated by the model
X_1 to X_n = YATS explanatory variables
u = randomly distributed error term

In this work the dichotomous dependent variable was coded 1 for HQ
and 0 for non-HQ.
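
To illustrate how the logistic form keeps a predicted probability
between zero and one, the sketch below inverts the log-odds
relationship given above. It is written in Python for illustration
only (the thesis estimates were produced in SAS), and the function
names are assumptions rather than anything taken from the original
work.

```python
import math


def logit(p):
    """Log odds ln[p / (1 - p)] of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inverse_logit(z):
    """Map a real-valued linear index z to a probability between 0 and 1."""
    # Two branches avoid overflow in exp() for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predicted_probability(intercept, coefficients, values):
    """P(HQ) implied by a + B_1*X_1 + ... + B_n*X_n (error term omitted)."""
    z = intercept + sum(b * x for b, x in zip(coefficients, values))
    return inverse_logit(z)


if __name__ == "__main__":
    for z in (-10.0, -1.0, 0.0, 1.0, 10.0):
        print(f"index {z:6.1f} -> probability {inverse_logit(z):.4f}")
```

However large or small the linear index becomes, the resulting value
stays within the zero-to-one range, which is the property the text
attributes to the logistic functional form.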

There was no restriction on the types of explanatory variables;
they could be continuous, categorical, or both. Theory and
experience show that characteristics such as age, gender, race,
education and socioeconomic status are highly correlated with
mental test achievement. This research concentrated on YATS
variables that best captured the effects of the above categories
and accepted some multicollinearity among explanatory variables
with the goal of achieving increased predictive ability of the
estimated equations.

An application of the analysis was to estimate the AFQT
categories of unmatched records, i.e., the rest of the population,
and to compare the relative mental achievement of the two sample
sets. This step entails running the estimated logistic regression
equations from the three target samples, White, Black and Hispanic,
against the entire YATS populations of 17 to 21 year old male high
school graduates and prospective graduates. The AFQT categories
were estimated using the SAS (OUTP=PRED) option to calculate
probabilities for missing dependent variable values of the
non-matched records. The high quality cut-off point was a predicted
probability greater than or equal to .5. Those observations with
estimated HQ probability less than .5 were categorized as non-high
quality.
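
The scoring step just described can be sketched as follows. This is
a minimal Python illustration, not the SAS procedure used in the
thesis. The coefficient values in EXAMPLE_MODEL are hypothetical and
are NOT the estimates reported in this study; only the .5 cut-off and
the variable names MATHT and COLGPREP come from the text.

```python
import math

HQ_CUTOFF = 0.5  # cut-off point stated in the text

# Hypothetical coefficients for illustration only; they are NOT the
# estimates reported in the thesis.
EXAMPLE_MODEL = {"intercept": -1.0, "MATHT": 0.6, "COLGPREP": 0.5}


def predict_hq_probability(model, record):
    """Predicted P(HQ) for one record under a fitted logit model."""
    z = model["intercept"] + sum(
        coef * record[name]
        for name, coef in model.items()
        if name != "intercept"
    )
    return 1.0 / (1.0 + math.exp(-z))


def classify(probability, cutoff=HQ_CUTOFF):
    """Return 1 (HQ) if probability >= cutoff, else 0 (non-HQ)."""
    return 1 if probability >= cutoff else 0


# Two made-up non-matched records with no observed AFQT score.
non_matched = [
    {"MATHT": 3, "COLGPREP": 1},
    {"MATHT": 0, "COLGPREP": 0},
]
predicted = [
    classify(predict_hq_probability(EXAMPLE_MODEL, record))
    for record in non_matched
]

if __name__ == "__main__":
    print(predicted)
```

In the thesis a separate equation was estimated for each racial
group, so in practice each non-matched record would be scored with
the model for its own group.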

E. EXPLANATORY VARIABLES

In a good econometric model, specification bias and errors must
be minimized. The effects of ethnic, cultural, economic and social
contributors are difficult to quantify and must often be measured
by proxy measures. Proper choice of relevant explanatory variables
and omission of irrelevant variables are imperative.

Explanatory variables for this research were drawn only from
the YATS survey. YATS respondents can be matched to DMDC files
only once they take the ASVAB, which can be considerably later than
the YATS interview. DMDC-matched YATS data becomes available
one to four years after the survey. Enlistment may be the result
of numerous factors not captured by the data, such as labor force
changes. Therefore, using the DMDC-matched information may
reasonably be viewed as adding bias to data already rife with
selectivity biases.
YATS data is subject to selectivity bias in several ways. As
mentioned earlier, the voluntary social security number disclosure
on the part of respondents may well bias all DMDC-matched data, a
potentially fruitful subject to explore if YATS is ever to be
bridged to pooled data sets such as NLSY. The interest questions
may cause multiple and hard-to-measure biases. By definition, those
who eventually contact the military to be ASVAB tested turn out to
be interested in military service, whether or not they stated so
during their interview. Simply being asked military interest
questions may pique a respondent's curiosity, which may later
crystallize into contact with the military. Also, despite the AFQT
bridge designed by RAND, the analyst can never know for certain
just how accurate the bridge really is; there is no way to test it
against respondents who have never taken the ASVAB or who took it
but did not disclose their social security number in the first
place. Still, despite the inherent biases, YATS may well be the
military's most current data set and lowest cost opportunity to
locate high quality QMA and QMI. The bridging riddle must first be
solved, however, before the data can be deemed reliable enough to
make sound, low-risk regional QMA decisions.

YATS questions that can potentially discern a respondent's
aptitude may be categorized as demographic, educational and labor
market status. Appendix A contains a detailed description of both
YATS and cohort questions ultimately considered in this analysis.
Table 9 lists the variable names used in this thesis, their
corresponding survey questions and each variable's possible
values.



TABLE 9. — LIST AND DESCRIPTION OF EXPLANATORY VARIABLES CHOSEN 



VARIABLE       YATS QUESTION                          VARIABLE
NAME                                                  RANGE

WHITE          Q714 (1+3+4)                           1=YES, 0=NO
(+ other)
BLACK          Q714 (2)                               1=YES, 0=NO
HISPANIC       Q715                                   1=YES, 0=NO
YATSAGE        Q403                                   17-21
SATPAST        Q698                                   1=YES, 0=NO
HOMED          Q713M                                  7-20
GRADE          Q700                                   1-7
HGCOMP         Q404                                   7-14
MATHT          Q703(1) - Q709(1)                      0-5
               Summation of math courses higher
               than elementary algebra: plane
               geometry + intermediate algebra +
               trigonometry + calculus + physics
BUSMATHT       Q705                                   1=YES, 0=NO
COLGPREP       Q70#                                   1=YES, 0=NO
COLGBND        Q411 (3 or 4)                          1=YES, 0=NO
COLGSTUD       Q408 (9 or 10)                         1=YES, 0=NO
EMPLOYED       Q416(1) & Q408(1)                      1=YES, 0=NO
UNEMPLOYED     Q416(2) & Q408(1)                      1=YES, 0=NO
               not employed and not enrolled
               in school
SELF EMPL      Q430 (3)                               1=YES, 0=NO



Note: The letters "LN" preceding one of the above variables
indicate the natural log of that variable. The letters "SQU"
following one of the above variables indicate the value of the
variable squared, or raised to the second power.
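
As an illustration of how MATHT and the "LN" and "SQU" forms might be
constructed, consider the Python sketch below. It is not the code
used in the thesis. The five question numbers are those named in the
text for the higher math courses; the response coding (1 = course
taken), the function names, and the decision to omit the log form
when a value is zero are assumptions made here for illustration.

```python
import math

# Higher math course questions named in the text: plane geometry,
# intermediate algebra, trigonometry, calculus and physics.
HIGHER_MATH_QUESTIONS = ("Q703", "Q706", "Q707", "Q708", "Q709")


def build_matht(responses):
    """Count higher math courses taken (0 to 5).

    `responses` maps a question number to 1 if the course was taken.
    That coding is an assumption made for this sketch.
    """
    return sum(1 for q in HIGHER_MATH_QUESTIONS if responses.get(q) == 1)


def add_transforms(name, value):
    """Return the variable with its SQU form and, if defined, its LN form."""
    forms = {name: value, f"{name} SQU": value ** 2}
    if value > 0:  # the natural log is undefined at zero
        forms[f"LN {name}"] = math.log(value)
    return forms


if __name__ == "__main__":
    answers = {"Q703": 1, "Q706": 1, "Q705": 1}  # Q705 is business math
    matht = build_matht(answers)
    print(add_transforms("MATHT", matht))
```

Business math (Q705) is deliberately left out of the count, since it
enters the analysis as the separate variable BUSMATHT.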
1. Demographic Explanatory Variables

Race indicators for White (plus other) and Black were
determined by Q714: answers one, three and four identified Whites
(plus other), and answer two identified Blacks. Hispanics were
identified with an additional filter, Q715. Respondents who stated
that they were of Hispanic descent were considered to be Hispanic
independent of their racial category response.
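
The race coding can be sketched as a small function. The Python below
is illustrative only and is not taken from the thesis. It assumes
that Q714 is recorded as an integer answer from 1 to 4 and that the
Q715 response has already been reduced to a true/false flag; both the
function name and those input conventions are hypothetical. The
precedence of the Hispanic filter follows the text, and the three
indicators are treated as mutually exclusive, consistent with the
group means reported in Table 10.

```python
def race_indicators(q714_answer, is_hispanic):
    """Return the WHITE, BLACK and HISPANIC dummy variables.

    q714_answer: racial category answer (1, 3 or 4 = White plus other,
                 2 = Black), following Table 9.
    is_hispanic: True if the respondent reported Hispanic descent on
                 Q715; this overrides the Q714 response.
    """
    if is_hispanic:
        return {"WHITE": 0, "BLACK": 0, "HISPANIC": 1}
    if q714_answer == 2:
        return {"WHITE": 0, "BLACK": 1, "HISPANIC": 0}
    if q714_answer in (1, 3, 4):
        return {"WHITE": 1, "BLACK": 0, "HISPANIC": 0}
    raise ValueError(f"unexpected Q714 answer: {q714_answer!r}")


if __name__ == "__main__":
    print(race_indicators(1, False))
    print(race_indicators(2, True))
```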

Question Q403 was employed to determine the respondent's 
age at the time of YATS interview. 

Past studies have shown that people from low income
families, particularly those below "the poverty level," have a
distinct disadvantage when taking mental achievement tests. There
was no known way to estimate a respondent's poverty status from
YATS, so this potentially telling variable could not be analyzed.

Generally, mental achievement scores vary by region across 
the nation. Students from the Southeastern states, for example, 
tend to score lower than those in New England. However, analysis 
of the chosen sample contradicted past nationwide studies and 
conventional wisdom. Therefore, region was not addressed. 

2. Education Explanatory Variables

Some education variables did not perform as well as
expected. The YATS analyst must keep in mind that the interview is
conducted by telephone with no expectation of verification or
feedback. Math courses taken or planned (Q702 through Q709) did
not perform as well as math courses taken alone (plans omitted).
Higher math courses, i.e., those taken after elementary algebra,
should be positively correlated with higher AFQT results. These
courses include plane geometry (Q703), intermediate algebra (Q706),
trigonometry (Q707), calculus (Q708) and physics (Q709). Business
math was also considered, as some students take it to fulfill a
graduation requirement and avoid having to endure elementary
algebra.

The response to Q699 (Do you plan to take the SAT/ACT?) 
also did not perform well. Q698, which ascertains whether the 
respondent had already taken a college entrance test, was used 
instead and it proved to be better correlated with AFQTPCT.

The education of both parents could not be used in this
analysis. Question Q713F, which records the highest grade completed
by the respondent's father, was not asked until 1984, the year
after 1983, the earliest YATS wave in this analysis. Since
mother's education consistently performed well as a predictor,
Q713M (mother's education) was used alone to represent parents'
education. Mother's education was captured in both YATS I and II.

High grade completed was determined with Q404. This 
variable was not expected to be very significant because the sample 
reduction step deleted those matched respondents who had quit high 
school before graduation. 

High school grades were estimated using Q700. This value 
should be viewed with some suspicion since the respondent may be 
motivated to inflate his grades despite the anonymity of the 
interview. 
Finally, students who indicated that they intended to seek
college education were identified with Q411.

3. Labor Market Status Explanatory Variables

Questions Q408 and Q416 were employed to determine labor
market status. Respondents' labor market status was defined in
this study as high school student, college student, employed
(full- or part-time, but not in school full time), and unemployed
(not employed and not in school full time).

In theory, variables such as local unemployment rates are
normally considered when discerning enlistment propensity. Such
rates also bear on the degree to which a respondent's unemployment
status is correlated with achievement. In other words, an
unemployed respondent from an area where the youth unemployment
rate is 30 percent may be more a victim of circumstances than of
any lack of mental ability, whereas an unemployed respondent from
an area with only five percent unemployment is more likely to be
the cause of his own unemployment status.

4. Descriptive Statistics

Table 10 depicts the descriptive statistics of the selected 
variables for both matched and non-matched records in each racial 
category. Several relationships, both among the racial groups and 
between the matched and non-matched respondents within each group, 
were notable. 

TABLE 10. — DATA COMPARISON OF ALL MALES

MEAN               WHITE        BLACK        HISPANIC     TOTAL
                   MATCH/NON    MATCH/NON    MATCH/NON    MATCH/NON

RECORDS            507/4,171    109/542      50/400       666/5,113
YATS AGE           18.1/18.2    18.1/18.3    18.2/18.3    18.1/18.2
AFQTPCT            56.09/NA     35.28/NA     43.62/NA     51.75/NA
WHITE              1/1          0/0          0/0          .76/.81
BLACK              0/0          1/1          0/0          .16/.11
HISPANIC           0/0          0/0          1/1          .08/.08
SATPAST            .55/.64      .39/.48      .38/.48      .51/.61
MOTHER'S
EDUCATION          12.4/12.9    12.4/12.4    11.9/11.7    12.4/12.7
COLLEGE PREP       .66/.73      .63/.61      .66/.68      .66/.71
HS GRADES          79.6/81.0    77.4/78.2    79.5/80.3    79.2/80.7
HIGH GRADE
COMPLETED          11.8/12.0    11.8/11.8    11.8/11.8    11.8/11.9
HIGH MATH
TAKEN              1.75/2.22    1.55/1.63    1.88/1.88    1.73/2.12
COLLEGE BOUND      .56/.61      .50/.54      .54/.59      .55/.60
HS STUDENT         .20/.30      .38/.32      .38/.30      .33/.31
COLLEGE STUDENT    .20/.28      .08/.17      .20/.23      .18/.27
EMPLOYED           .35/.31      .30/.32      .24/.32      .33/.31
UNEMPLOYED         .13/.10      .24/.19      .18/.16      .15/.12
SELF EMPL          .02/.04      .02/.03      .00/.03      .02/.04



a. Matched Sample 

The matched sample was the set where the aforementioned
biases most probably exist. This sample probably constituted a
fair cross-sectional representation of the three racial groups who
arrived at the typical MEPS seeking enlistment. The mean AFQT
percentiles of the three racial groups varied widely: Whites
averaged the 56th percentile, Blacks the 35th, and Hispanics almost
the 44th. Considering the long-standing trend of Blacks attending
substandard schools along with the documented means in this sample,
the likely reasons for the lower Black mean were apparent. The
sample's Black males took fewer higher mathematics courses than
White males, a mean of 1.55 versus 1.75. Blacks were enrolled full
time in college at less than half the rate of Whites: 8 percent
versus 20 percent. Still, the average mother's education of Whites
and Blacks was identical. The Blacks in the sample were almost
twice as likely to be out of school and unemployed as Whites,
despite the fact that all non-high school graduates had been
filtered out of the sample. The matched Hispanics, on the other
hand, had taken more higher math courses than the other two groups,
were or had been enrolled in college preparatory curricula at the
same rate as Whites, and were enrolled in college at the same rate
as Whites. Yet their mean AFQT percentile lagged about 12 points
below that of Whites. Two likely explanations exist for the above
findings. Despite their relatively solid educational background,
the language barrier may have held the Hispanics at a comparative
disadvantage on the written ASVAB test. The Black males' poor
showing may simply reflect the quality of their schooling and their
often disadvantaged individual socio-economic backgrounds.
b. Non-Matched Sample

The generally expected contrasts between the matched and
non-matched samples tend to hold true. Of course, actual AFQT
percentiles for the non-matched set were unavailable and are
predicted later in this thesis. The RAND AFQT bridging estimator
has not been applied to YATS records earlier than 1986. Every
variable selected for analysis showed the average non-matched White
male to be a higher mental test achiever than his matched
counterpart: higher mother's education, more enrolled in college
preparatory high schools, more higher math courses taken, more
college bound, more enrolled in college, etc. The same contrasts
hold true for Blacks, but the differences were less pronounced.
The Hispanic group in the non-matched sample and that of the
corresponding matched sample were almost identical, suggesting
that, more than the other two groups, the Hispanic community
provides a representative cross section of its young males to
military service.

F. DATA AND METHODOLOGY SUMMARY 

The matched sample set selected for analysis contained 666
observations. Applying the same selection criteria to the
non-matched population, which had either withheld social security
numbers or had no matched record or both, yielded 5,113
observations.

Variables capturing the effects of characteristics measured by
the YATS questions depicted in Appendix A were used in a binary
logistic regression analysis with the goal of obtaining a model
with the highest possible predictive ability. Emphasis was placed
more on theoretical consistency and predictive ability and less on
explaining the effects of individual variables, minimizing
multicollinearity among explanatory variables, or building a
parsimonious model. The polynomial, exponential and logarithmic
relationships of all semi-continuous variables, e.g., mother's
education (HOMED), high grade completed at time of interview
(HGCOMP), higher math courses taken (MATHT) and high school grades
(GRADE), to the dependent variable HQ were also explored.

Three key questions are conspicuously missing from YATS: "Do
you live in an urban, suburban or rural area?", "Have you or your
family received any type of means-tested federal, state or local
governmental assistance such as food stamps, welfare, etc., in the
past 12 months?" and "Were you reared in a dual-parent family?"
These questions would better identify a respondent's socio-economic
status and help compare those surveyed by YATS to those queried by
other surveys such as NLSY. These questions might also facilitate
helpful comparisons with demographic studies such as the national
census. The urban/suburban/rural question could be ferreted out of
YATS by using the CTYFIPS2 and zip code variables, but not without
an inordinate amount of painstaking effort.
IV. ANALYSIS



A. GENERAL 

The logistic regression equations were estimated using the 
LOGIST procedure of SAS, version 5.16. The primary criterion for 
selecting the best model was goodness of fit as measured by the 
percentage of respondents correctly identified as high quality or 
non-high quality prospects. Theoretically consistent signs, 
parsimony and multicollinearity were also considered. 
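
The goodness-of-fit criterion described above, the percentage of
respondents correctly identified, can be illustrated with a short
calculation. The Python below is a sketch under stated assumptions,
not the SAS output used in the thesis: the function name
percent_correct is hypothetical, the probabilities in the example are
made up, and the .5 cut-off is the one stated in the methodology.

```python
def percent_correct(actual, predicted_probabilities, cutoff=0.5):
    """Percentage of respondents whose HQ status is correctly predicted.

    actual: observed HQ codes (1 = HQ, 0 = non-HQ).
    predicted_probabilities: model-predicted P(HQ) for the same people.
    """
    if len(actual) != len(predicted_probabilities):
        raise ValueError("inputs must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    hits = sum(
        1
        for observed, p in zip(actual, predicted_probabilities)
        if (1 if p >= cutoff else 0) == observed
    )
    return 100.0 * hits / len(actual)


if __name__ == "__main__":
    # Made-up values for illustration; not results from this study.
    observed = [1, 0, 1, 0]
    probabilities = [0.8, 0.3, 0.4, 0.5]
    print(percent_correct(observed, probabilities))
```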

Comparisons of the means of the dependent and explanatory
variables among the racial groups confirm expectations that the
target matched sample differs significantly from the non-matched
sample. The matched sample's bias was expected in that it appears
to portray accurately the profile of youth who actually seek
military service as opposed to portraying the "average" American
youth.

B. LOGIT REGRESSION 

Logit regression was used to develop estimating equations for
whether DMDC-matched YATS male respondents were high quality (AFQT
percentile greater than or equal to 50) or non-high quality. Since
there were only two categories, the dependent variable was
specified as binomial. The specific market segment analyzed was 17
to 21 year old males who were either high school graduates or
prospective high school graduates.
1. MODELS



Separate models were developed for the three racial groups, 
White, Black and Hispanic. Table 11 presents results of selected 
predictive models by race. 



TABLE 11. — RESULTS OF PREDICTIVE MODELS BY RACE 







2. ANALYSIS OF EXPLANATORY VARIABLES



Age at interview had a significant negative effect on the
dependent variable for both Whites and Blacks, but was not
significant for Hispanics. Age was omitted from the final Hispanic
model because it did not contribute to its predictive ability. The
YATS interview itself may spark some curiosity about the military
on the part of the respondent, which translates to some unknown
degree of selection bias.

Whether a respondent had taken a college entrance
examination in the past (SATPAST) had a positive coefficient for
both Whites and Blacks. YATS also asks whether the respondent
intends to take a college entrance examination, but that variable
proved to be a much weaker predictor. SATPAST for Whites was
significant only at .127. SATPAST for Blacks was not significant,
but did contribute to the model's predictive quality.

Prevailing experience indicates that mother's education
interacts very little with most other background factors and is not
highly correlated with mathematical skills. It therefore
promised to perform well in tandem with the mathematics experience
discussed below in forming an accurate mental achievement profile.
Mother's education (HOMED) was positively correlated with AFQT
performance, consistent with past studies. However, mother's
education was not as powerful a predictor as might be expected in
the target sample, perhaps because it varied less than in the
general population. The target sample's HOMED variable showed that
young males who enlisted tended to come from families whose mothers
did not have a college degree, no matter what their race. Since
most youth who show interest in the military tend to come from
lower and lower-middle class families, there is little variance in
their mothers' educational backgrounds. The variable was not
significant for any of the racial segments, which is attributable
to the narrow spread of HOMED in the target sample. Most mothers of
respondents in all three racial subgroups had completed between
tenth grade and the first year of college.

Note: Mother's education tends to associate most directly with
word knowledge, paragraph comprehension, general science,
arithmetic reasoning, and mathematics knowledge. Still, it
stands to reason that mother's education is only beneficial as
background information and is not as strong a predictor as a
respondent's formal education. This condition is borne out by
this analysis. An individual's math courses taken were more
highly correlated with AFQT performance than mother's education.
Also, see Profile of American Youth, p. 274.

High school grades contributed to the predictive ability of
both the White and Black models, but surprisingly little. Both
GRADE and GRADE SQU were employed in the White model because they
marginally improved the accuracy of the model. The log
transformation of GRADE (LN GRADE) fit best for Blacks, but still
was not significant (.418).

High grade completed (HGCOMP) was relatively significant in
the White model, both as HGCOMP and HGCOMP SQU: .087 and .067.
This was intriguing in view of the target sample chosen. Only high
school graduates or prospective high school graduates were
considered in the analysis, so this variable varied only when the
respondent was still in high school or had some college experience.
Experience shows that by the time all respondents have moved beyond
the prime market, their HGCOMP values should be clustered close to
the mean and, therefore, offer little to the model. Recall that
highest grade completed was the value recorded at the date of the
interview. HGCOMP did not contribute to the Black model, perhaps
for reasons similar to those cited above. Few Blacks in the sample
had gone on to college and most were still in high school. The LN
HGCOMP variable was included in the final Hispanic model, but was
significant only at .175.

Mathematics courses taken by the time of interview proved
to be the most telling variables analyzed. One or more mathematics
variables were included in the models for all three racial groups.
For the purposes of this thesis, the summation of mathematics
courses higher than elementary algebra constituted the variable
MATHT. See Table 9 for a more complete description. For Whites,
MATHT and MATHT SQU were significant at .001 and .007,
respectively. MATHT was also employed in the Black model and was
significant as well, at .006.

Some individual mathematics courses, such as plane geometry
and business math, performed much better than others. Plane
geometry was most highly correlated with AFQT percentile for all
three groups, but was included as a part of the MATHT variable.
Students who take plane geometry quite likely demonstrate
inherently greater math aptitude as well as receive valuable
mathematical training from the course itself. The same may be said
of all other higher math courses. But plane geometry proved a
superior predictor to other, more advanced, math courses because so
few students who eventually sought military service took courses
such as intermediate algebra, trigonometry, calculus or physics.

Business mathematics was negatively correlated with AFQT 
percentile for all three racial groups, but contributed most to the 
Hispanic model: -.99 coefficient at .229 significance. Business 
math is viewed by many high school students as an easy alternative 
for fulfilling high school graduation requirements, which may 
explain its negative correlation with AFQT percentile. Students 
with less mathematical ability or confidence who took business math 
without going on to more advanced courses apparently did not gain 
the necessary mathematical reasoning skills to perform well on the 
ASVAB.

High school curriculum confirmed a priori expectations that
students from college preparatory curricula achieve higher scores
on achievement tests. Respondents who answered affirmatively to
college preparatory as opposed to business, technical or vocational
schooling generally performed better on the ASVAB. As modeled, the
dummy variable COLGPREP had one of the stronger coefficients,
significant for Whites at .004. Since most students participate in
at least nominal college preparatory curricula, it was at first
expected that there would not be enough variance to make this
variable a valuable predictor. This expectation proved true with
regard to the Black model. COLGPREP also performed well in the
Hispanic model, but with less significance, .173. A related
variable, COLGBND, which denoted those respondents who indicated
positive intentions to attend college after high school,
contributed to both the White and Black models, but not well enough
to include in the Hispanic model.

College students logically should be expected to perform
better on achievement tests than those not in college. The
variable COLGSTUD was used as a labor force category, the others
being high school students (HSSTUD), EMPLOYED (not in school full
time) and UNEMPLOYED (not in school full time). As expected,
COLGSTUD performed well in the Black and Hispanic models.
COLGSTUD had the highest coefficient in both models and was
significant at .013 in the Hispanic model. Surprisingly, the
COLGSTUD coefficient was negative in the White model, but
significant only at .255. It did improve the White model's
accuracy and was retained in the final model. A possible
explanation for the coefficient's negative value in the White model
is that some White males who began college realized they were not
ready for college and turned to the military as an alternative.

Other labor force variables served marginally well as
predictors. Given that the respondent was not in school full time
and was unemployed at the time of interview, UNEMPLOYED contributed
to the White model, but was significant only at .377.
Surprisingly, the Black UNEMPLOYED coefficient was positive, though
not significant at .605. The positive sign of the Black UNEMPLOYED
coefficient may simply reflect that Blacks who found getting a good
job after graduation difficult eventually sought military service.
At any rate, because it did improve the accuracy of the model
despite contradicting expectations, it was retained. EMPLOYED
contributed to the accuracy of the White model only.

The type of employment, whether private employer,
government employer, family business or farm, or self employed, was
not a significant factor except for Whites. For White respondents,
self employment was a significant negative predictor: a coefficient
of -1.6, significant at .043. The scores of self employed Blacks
were consistent with the base case, HSSTUD. There were no self
employed Hispanics in the target sample.

The direct, exponential and logarithmic relationships of 
all semi-continuous variables, e.g., mother's education (HOMED), 
high grade completed (HGCOMP), higher math courses taken (MATHT) 
and high school grades (GRADE), to the dependent variable HQ were 
explored and included if they contributed to model accuracy. 
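
The candidate functional forms mentioned above can be sketched as follows. This is a minimal illustration, not the procedure actually run for the thesis: it reads "exponential" literally as exp(x), the helper name `candidate_forms` is hypothetical, and the HGCOMP values are made-up codes in the range listed in Appendix A.

```python
import math

def candidate_forms(values):
    """Return direct, exponential and logarithmic versions of a variable."""
    return {
        "direct": list(values),
        "exponential": [math.exp(v) for v in values],
        "logarithmic": [math.log(v) for v in values],  # requires v > 0
    }

# HYPOTHETICAL high-grade-completed codes (10th grade to 1st year college).
hgcomp = [10, 11, 12, 13]
forms = candidate_forms(hgcomp)

for name, column in forms.items():
    print(name, [round(v, 3) for v in column])
```

Each candidate column would then be entered in the logistic model in turn and kept only if it improved model accuracy, as the text describes.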

C. APPLICATION OF MODEL RESULTS TO NON-MATCHED RESPONDENTS 

After developing the estimating equations, the two samples were 
then compared and contrasted. First, the AFQT (high quality/non-high 
quality) probabilities for the non-matched sample were 
estimated from the equations developed for the 
matched racial groups' models. Then the quality distributions were 
examined as well as how they related to interest in the military. 
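
The scoring step can be sketched in a few lines. This is a hedged illustration, not the SAS procedure used in the thesis: the intercept and coefficients below are hypothetical placeholders (the fitted values are not reproduced here), the function name `predict_hq_probability` is invented for the example, and averaging the predicted probabilities to obtain a group share is an assumption about how such an estimate could be formed.

```python
import math

def predict_hq_probability(intercept, coefs, respondent):
    """Logistic probability that a respondent scores as high quality."""
    z = intercept + sum(coefs[name] * respondent.get(name, 0.0) for name in coefs)
    return 1.0 / (1.0 + math.exp(-z))

# HYPOTHETICAL coefficients for illustration; variable names follow the text.
intercept = -0.5
coefs = {"COLGSTUD": 0.8, "UNEMPLOYED": -0.3, "MATHT": 0.4}

# HYPOTHETICAL non-matched respondents (no AFQT score on file).
non_matched = [
    {"COLGSTUD": 1, "UNEMPLOYED": 0, "MATHT": 3},
    {"COLGSTUD": 0, "UNEMPLOYED": 1, "MATHT": 1},
    {"COLGSTUD": 0, "UNEMPLOYED": 0, "MATHT": 2},
]

probs = [predict_hq_probability(intercept, coefs, r) for r in non_matched]

# ASSUMPTION: the estimated high quality share is the mean predicted probability.
estimated_hq_share = sum(probs) / len(probs)
print([round(p, 3) for p in probs], round(estimated_hq_share, 3))
```

In practice one such equation would be applied per racial group, since the thesis estimates separate White, Black and Hispanic models.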
1. Application of Estimating Equations to Non-Matched Sample 

Applying the estimating equations for the matched sample with the 
OUTP (PRED) SAS option produced results that confirmed expectations 
that the general population was better qualified in QMA terms than 
those who actually took the AFQT. The high quality probabilities for 
actual matched AFQT scores and estimated non-matched probabilities 
are presented in Table 12. 

TABLE 12. COMPARISON OF MATCHED AND NON-MATCHED OBSERVATIONS 
(ACTUAL/ESTIMATED) 

                    WHITE        BLACK      HISPANIC 
N                   507/4,113    109/542    50/400 
HIGH QUALITY        .59/.72      .21/.15    .38/.35 
NON-HIGH QUALITY    .41/.28      .79/.85    .62/.65 
TOTAL               1/1          1/1        1/1 



The non-matched sample of Whites offered 72 percent high 
quality prospects, or QMAs, which exceeded the 59 percent for the 
matched sample of Whites, i.e., those whose contact with the 
military extended at least to taking the ASVAB. The Black 
model revealed the opposite: 21 percent of those with military 
contact actually tested as high quality as opposed to only an 
estimated 15 percent of non-matched Blacks. The Hispanic samples 
mirrored the trend for Blacks, but at higher high quality 
probabilities: .38/.35. The trends portrayed by these samples were 
that the military was attracting fewer high quality Whites 
percentage-wise, but more high quality Blacks and Hispanics, than 
were available in the general population. 
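
The direction and size of each gap can be checked with simple arithmetic. The sketch below uses only the high quality shares printed in Table 12; the dictionary layout and variable names are illustrative choices.

```python
# High quality shares from Table 12: (actual matched, estimated non-matched).
table12 = {
    "WHITE": (0.59, 0.72),
    "BLACK": (0.21, 0.15),
    "HISPANIC": (0.38, 0.35),
}

# Gap = estimated non-matched share minus actual matched share.
gaps = {group: round(est - act, 2) for group, (act, est) in table12.items()}

for group, gap in gaps.items():
    direction = "higher" if gap > 0 else "lower"
    print(f"{group}: non-matched share is {abs(gap):.2f} {direction} than matched")
```

A positive gap means the non-matched population is estimated to hold proportionally more high quality youth than the group that actually tested.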

2. Interest Comparisons 

An additional assessment of the AFQT model was provided by 
examining how the AFQT groupings in the estimated samples were 
distributed across interest categories, compared with the interest 
distribution for the actual AFQT score sample. Table 13 shows that 
for the matched sample the more positive interest a White 
respondent indicated during the YATS interview, the lower the 
probability he would score as a high quality recruit. For 
instance, given that a White respondent from the matched sample 
stated that he was definitely interested in serving in the 
military, there was a .54 likelihood that he would be a high 
quality recruit prospect. The results for Blacks were somewhat 
mixed, but the general trend was the same: the more interested the 
respondent, the less likely that he would score as high quality. 
The Hispanic sample did not fit the general pattern of the other 
two racial groups: the highest probabilities lay at the two 
extremes of interest. 
TABLE 13. HIGH QUALITY PROBABILITY BY INTEREST CATEGORY: MATCHED 
VERSUS NON-MATCHED BY RACIAL GROUP 

                  WHITE        BLACK       HISPANIC 
                  M/NON-M      M/NON-M     M/NON-M 
N                 507/4,171    109/572     50/400 
DEFINITELY YES    .54/.76      .17/.18     .53/.15 
PROBABLY YES      .57/.64      .24/.09     .26/.22 
PROBABLY NOT      .60/.73      .14/.18     .25/.34 
DEFINITELY NOT    .61/.73      .23/.18     .50/.44 
TOTAL             .59/.72      .21/.15     .38/.35 



Comparing the high quality probabilities of the two samples 
within each racial group also confirmed a priori expectations. In 
the White samples the non-matched high quality probabilities were 
always higher, indicating that the military was attracting less 
than a representative share of high quality White recruits. The 
trend was mixed for Blacks and Hispanics, suggesting that many 
minority people considered military service as an economic 
opportunity rather than a near last resort. 
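
The pattern described above can be verified directly from the four interest rows of Table 13. The sketch below uses only the printed values; the helper name `count_non_matched_higher` is illustrative.

```python
# Table 13 high quality probabilities: (matched, non-matched) by interest level.
table13 = {
    "WHITE": {"DEFINITELY YES": (0.54, 0.76), "PROBABLY YES": (0.57, 0.64),
              "PROBABLY NOT": (0.60, 0.73), "DEFINITELY NOT": (0.61, 0.73)},
    "BLACK": {"DEFINITELY YES": (0.17, 0.18), "PROBABLY YES": (0.24, 0.09),
              "PROBABLY NOT": (0.14, 0.18), "DEFINITELY NOT": (0.23, 0.18)},
    "HISPANIC": {"DEFINITELY YES": (0.53, 0.15), "PROBABLY YES": (0.26, 0.22),
                 "PROBABLY NOT": (0.25, 0.34), "DEFINITELY NOT": (0.50, 0.44)},
}

def count_non_matched_higher(rows):
    """Count interest categories where the non-matched probability is higher."""
    return sum(1 for matched, non_matched in rows.values() if non_matched > matched)

higher_counts = {group: count_non_matched_higher(rows) for group, rows in table13.items()}
for group, count in higher_counts.items():
    print(f"{group}: non-matched higher in {count} of {len(table13[group])} categories")
```

The White non-matched probability is higher in all four categories, while the Black and Hispanic results are split, which matches the "mixed" description in the text.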
V. SUMMARY, CONCLUSIONS AND RECOMMENDATIONS 



A. SUMMARY 

Binomial logistic regression modeling was employed to develop 
estimating equations for matched YATS and DMDC files of high school 
and prospective high school graduate males, 17 to 21 years of age, 
from YATS waves of 1983 and 1985 in the racial groups White, Black, 
and Hispanic. The predictive abilities of these models were then 
calculated against actual scores contained in the DMDC cohort 
files. The models were then applied to the larger unmatched 
population to estimate high quality or non-high quality categories. 
The results of both modeling efforts were then considered in 
light of the respondents' answers to the interest in the military 
question. 

B. CONCLUSIONS 

YATS selectivity biases, caused by voluntary social security 
number disclosure and the fact that only a fraction of those who 
disclose their social security number ever take an ASVAB test, 
offer some intriguing challenges to analysts. It is unfortunate 
that the RAND AFQT prediction procedures for the unmatched samples 
were only applied as far back as 1986. Currently there is no 
alternative AFQT categorization of these YATS respondents with 
which to compare model results. 

Still, both matched and non-matched samples confirm 
expectations that high quality youth are less likely to seek 
military service than non-high quality youth. Interest versus high 
quality distributions also confirmed this tendency, with expected 
deviations within minority groups. 

The analysis supports the idea that the YATS data set exhibits 
tendencies not unlike NLSY in the aggregate. This thesis 
demonstrates that YATS survey data can be used to create a 
synthetic classification procedure for distinguishing high quality 
respondents. The method used in this thesis corrects the current 
deficiency of DMDC methods that rely on interest in the military to 
predict AFQT category. 

C. RECOMMENDATIONS 

1. YATS Modifications 

Applying the RAND AFQT routines to YATS years 1983 and 1985 
may offer a basis for comparison that could facilitate building the 
desired bridge between YATS and NLSY in that the selectivity biases 
of YATS might be at least partially negated. 

Adding a YATS question on whether the respondent is a 
product of a nuclear family or a broken home is also desirable. A 
social variable such as this would provide some insight into the 
effects of divorced or single parents on enlistment behavior, or at 
least on ASVAB achievement. The increased numbers of single 
parents in today's society may well demand that this factor be 
considered. 

Refining the capability to capture the respondent's urban, 
suburban or rural demographic status would further aid in bridging 
YATS to NLSY. The quality of education, though it may vary within 
the three categories, would offer some valuable insight into both 
interest and achievement not currently completely captured by YATS. 

A YATS question to capture to some degree the respondent's 
economic status would be beneficial, for example, poverty or 
non-poverty categories by virtue of whether the respondent's family 
received a means-tested subsidy during the past 12 months. 

2. Further Study 

Comparing and contrasting YATS 1983 through 1985 with the 
same years of NLSY is imperative if the two data sets are to be 
bridged. These years include the entire window in which both YATS 
and NLSY asked similar propensity questions. This thesis is only 
a modest first step in that direction. The benefits, if they 
can indeed be realized, are enormous. The Department of Defense 
would have at its disposal a more current management decision 
system data set than currently exists on which to base recruiting 
force and resource allocation decisions. 

Eliminating or at least partially reconciling the 
selectivity biases of YATS offers many opportunities for further 
study. Capturing the reasons for non-disclosure 
could make it possible to examine the profiles of non-disclosure 
groups against those with known social security numbers and those 
with matched records. Some of the biases, such as social security 
number disclosure, may not affect YATS analysis as adversely as one 
might suspect. 
This thesis, like most others, addresses only those 
respondents already in the prime market. Comparing and contrasting 
the behavior of 16 year olds with that of the prime market should be 
considered. If analysis of 16 year old YATS respondents yields 
results similar to those for respondents already in the prime market, 
a full year of lead time could be realized in applying the data set 
to management decisions. 

Finally, applying these YATS results to subsequent YATS 
years, particularly to 1990, and comparing the synthetic YATS 
categories would also be beneficial. The RAND AFQT estimation 
method currently employed by DMDC could be further evaluated and 
perhaps validated or improved. 
APPENDIX A 



This is a listing of YATS questions used in this thesis, either for 
sample reduction or for analysis, with a listing of 
possible responses. Questions are categorized using the YATS II 
numbering scheme. Cross-referenced YATS I question numbers follow 
in parentheses. 

Q402 (A2) - What is your gender? 

1 - male 

2 - female 

Q403 (A3) - What was your age on your last birthday? 

The code is the reported age. 

Range: 16-29 

Q404 (A4) - Now I have a few questions about your educational 
experiences and plans. What is the highest grade or year of school 
or college that you have completed and gotten credit for? 

7 - less than 8th grade 

8 - 8th grade 

9 - 9th grade 

10 - 10th grade 

11 - 11th grade 

12 - 12th grade 

13 - 1st year college/junior or community college/vocational, 
business or trade school 

14 - 2nd year college/junior or community college/vocational, 
business or trade school 

. - BD 

Q406 (HIDEGREE) - Do you have a regular high school diploma, a GED, 
an ABE, or some other kind of certificate (of high school 
completion)? 

1 - regular high school diploma 

2 - adult basic education 

3 - graduate equivalency degree 

4 - Some other kind of certificate of high school equivalency 

5 - None of the above 
Q407 (A11) - (In October, will you be/Are you) enrolled in any 
school, college, vocational or technical program, apprenticeship, 
or job training course? 

1 - yes 

2 - no 

Q408 (A12) - What kind of school or training program (will you 
be/are you) enrolled in? 

1 - no schools or training programs 

2 - ABE 

3 - Taking high school classes in a regular, day high school 

4 - GED or high school equivalency program 

5 - skill development program 

6 - on-the-job training program 

7 - apprenticeship program 

8 - vocational, business, or trade school 

9 - two-year junior or community college 

10 - four-year college or university 

Q409 (A14) - Will you be enrolled: 

1 - full-time or 

2 - part time? 

8 - DK 

. - LS 

Q410B (A8) - How about sometime further into the future - would you 
like to get more schooling? 

1 - yes 

2 - no 
8 - DK 
. - LS 

Q411 (A9) - What kind of school or college would you like to 
attend? 

1 - high school 

2 - vocational, business, or trade school 

3 - two-year junior or community college 

4 - four-year college or university 

5 - graduate or professional school 

8 - DK 

9 - RE 
. - LS 
Q416 (A17) - Are you currently employed, either full-time or 
part-time? 

1 - yes 

2 - no 
9 - RE 
. - LS 

Q417 (A18) - Are you looking for work now? 

1 - yes 

2 - no 
9 - RE 
. - LS 

Q419 (A19) - Have you ever had a job for pay? 

1 - yes 

2 - no 
9 - RE 
. - LS 

Q430 (A30) - At your (main/last) job, (are/were) you? 

1 - an employee of a private company 

2 - a government employee, 

3 - self-employed in your own business, or 

4 - working without pay in a family business or farm? 

9 - RE 

. - LS 



Q503 (B3) - How likely is it that you will be serving in the 
military? Would you say: 

1 - definitely, 

2 - probably, 

3 - probably not, or 

4 - definitely not? 

8 - DK 

9 - RE 

. - BD 






Q693 (D64) - To help me ask the next few questions correctly, I 
need to know whether you are currently: 

1 - married, 

2 - widowed, 

3 - separated, 

4 - divorced, or 

5 - have you never been married? 

8 - DK 

9 - RE 
. - BD 

Q698 (D70) - Have you ever taken a college entrance examination 
such as the PSAT (Preliminary Scholastic Aptitude Test), the SAT 
(Scholastic Aptitude Test), or the ACT (American College Testing 
Program)? 

1 - yes 

2 - no 

8 - DK 

9 - RE 

. - BD 

Q699 (D71) - In the future do you plan to take a college entrance 
examination? 

1 - yes 

2 - no 

8 - DK 

9 - RE 

. - LS 

Q700 (D72) - What grades did you usually 

1 - Mostly A's (A numerical average of 90 

2 - Mostly A's and B's (85-89) 

3 - Mostly B's (80-84) 

4 - Mostly B's and C's (75-79) 

5 - Mostly C's (70-74) 

6 - Mostly C's and D's (65-69) 

7 - Mostly D's and F's (64 and below) 

8 - DK 

9 - RE 

. - LS 




Q701 (D73) - Was your high school program: 

1 - academic or college preparatory, 

2 - commercial or business training, 

3 - or vocational or technical? 

8 - DK 

9 - RE 
. - LS 

Have you taken or do you plan to take the following courses in high 
school: 

Q702 (D74A) - elementary algebra 

1 - taken 

2 - plan to take 

3 - not taken 

8 - DK 

9 - RE 
. - LS 

Q703 (D74B) - plane geometry 

1 - taken 

2 - plan to take 

3 - not taken 

8 - DK 

9 - RE 
. - LS 

Q704 (D74C) - business math 

1 - taken 

2 - plan to take 

3 - not taken 

8 - DK 

9 - RE 
. - LS 

Q705 (D74D) - computer science 

1 - taken 

2 - plan to take 

3 - not taken 

8 - DK 

9 - RE 
. - LS 
Q706 (D74E) - intermediate algebra 



1 - taken 

2 - plan to take 

3 - not taken 

8 - DK 

9 - RE 
. - LS 

Q707 (D74F) - trigonometry 

1 - taken 

2 - plan to take 

3 - not taken 

8 - DK 

9 - RE 
. - LS 

Q708 (D74G) - calculus 

1 - taken 

2 - plan to take 

3 - not taken 

8 - DK 

9 - RE 
. - LS 

Q709 (D74H) - physics 

1 - taken 

2 - plan to take 

3 - not taken 

8 - DK 

9 - RE 
. - LS 
Q713M (D77) - What is the highest grade or year of school or 
college that your mother completed? 

7 - less than 8th grade 

8 - 8th grade 

9 - 9th grade 

10 - 10th grade 

11 - 11th grade 

12 - 12th grade 

13 - 1st year college/junior or community college/vocational, 
business or trade school 

14 - 2nd year college/junior or community college/vocational, 
business or trade school 

15 - 3rd year of 4-year college (JR) 

16 - 4th year of 4-year college (SR) 

17 - 5th year college/1st year graduate or professional school 

18 - 2nd year graduate or professional school 

19 - 3rd year graduate or professional school 

20 - more than 3 years graduate/professional school 

98 - DK 

99 - RE 

. - BD 

Q714 (D80) - Just to be sure we are representing all groups in our 
survey, please tell me whether you consider yourself . . . (If 
"HISPANIC" PROBE: Do you consider your race to be white, black, 
Asian, or American Indian?) 

1 - white? 

2 - black? 

3 - Asian or Pacific Islander? (Includes Chinese, Japanese, 
Filipino, Korean, Vietnamese, Pacific Islander, Asian Indian, or 
other Asian) 

4 - American Indian or Alaskan Native? 

8 - DK 

9 - RE 
. - BD 

Q715 (D80) - Are you of Hispanic background? 

1 - Yes 

2 - No 

8 - DK 

9 - RE 
. - BD 
The following variables are taken from the DMDC cohort data set and 
were not changed in 1984. 

TYPE - Type of record on file at DMDC 

0 - no contact with MEPS 

1 - Record showing examination results 

2 - Enlistment into delayed entry program (DEP) 

3 - Enlistment to active duty 

4 - Discharged from DEP 

64 - No ASVAB score recorded 

. - LS (No social security number available with 
which to match YATS and cohort data) 

AFQTPCT - 01 to 99 according to scores. 

AGE - Age at time respondent was tested. 